Length of green season from cloud-filtered Landsat SR

This takes as input a time-series csv file generated by Google Earth Engine of cloud-filtered Landsat Surface Reflectance NDVI from Landsats 5, 7, and 8 on a geometry such as point or polygon. The getLogs function counts the days between the first and last observations above a threshold value, for each year of data. Thus for winter growing seasons this will not be adequate.

Data can be sparse for some years due to cloudiness or missing Landsat data.



In [1]:

    
import pandas as pd, numpy as np
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
%matplotlib inline

get the data

from Earth Engine. For example, this script generates a time series from Landsat 5, 7, and 8 Surface Reflectance products on plot GBRO1 (pasture) or GBRO2 (cropland) in North Dakota.

https://code.earthengine.google.com/070a15b53111ad808b01e0ace1de433b

plot locations:

GBRO1: https://atlasbiowork.com/sites/615

GBRO2: https://atlasbiowork.com/sites/616



In [2]:

    
#function to count days above threshold NDVI from time series data
def getLogs(self, th): #args: dataframe, threshold NDVI value
    df=self
    df['next']=df.nd.shift(-1)
    df['prev']=df.nd.shift(1)
    #find first above threshold
    df['first'] = ((df.nd>=th)&((df.prev<th)|(df.prev.isnull())))
    #find last above threshold
    df['last'] = ((df.nd>=th)&((df.next<th)|(df.next.isnull())))
    #credit 16 days for single observations above threshold
    singles = df[(df['first']==True) & (df['last']==True)].nd.count()*16
    #now remove these
    df = df[~((df['first']==True) & (df['last']==True))]
    #remove all but first and last
    df = df[((df['first']==True) | (df['last']==True))]
    #get intervals between first and last
    df['nextdate'] = df.date.shift(-1)
    df['inc'] = (df['nextdate']-df['date']).dt.days #increment in days
    return int(df[df['first']==True].inc.sum()+singles)

Crop field plot



In [3]:

    
df = pd.read_csv('/Users/Peter/Downloads/gbro2.csv')
threshold = .25
location= 'Brown Ranch crop field, GBRO2'



In [4]:

    
df['date'] = pd.to_datetime(df['system:time_start'])
df.index = df['date']
del df['system:time_start']
df = df.dropna() # drop the rows without observations (masked)
df = df[df['date'].dt.year<2018] #chop off 2018 which EE didn't do
logs = df.groupby(df['date'].dt.year).apply(getLogs,th=threshold)
count = df.groupby(df['date'].dt.year).agg({'nd':'count'})
df = logs.to_frame().join(count)
df.columns = ['days of green','number of observations']
df['rolling mean']=df['days of green'].rolling(5).mean()
df









    Out[4]:






  
    
      
      days of green
      number of observations
      rolling mean
    
    
      date
      
      
      
    
  
  
    
      1985
      80
      19
      NaN
    
    
      1986
      64
      16
      NaN
    
    
      1987
      160
      15
      NaN
    
    
      1988
      32
      15
      NaN
    
    
      1989
      112
      11
      89.6
    
    
      1990
      16
      3
      76.8
    
    
      1991
      48
      13
      73.6
    
    
      1992
      112
      12
      64.0
    
    
      1993
      32
      8
      64.0
    
    
      1994
      80
      8
      57.6
    
    
      1995
      80
      10
      70.4
    
    
      1996
      151
      16
      91.0
    
    
      1997
      121
      16
      92.8
    
    
      1998
      80
      17
      102.4
    
    
      1999
      104
      30
      107.2
    
    
      2000
      120
      30
      115.2
    
    
      2001
      128
      30
      110.6
    
    
      2002
      88
      27
      104.0
    
    
      2003
      120
      22
      112.0
    
    
      2004
      208
      40
      132.8
    
    
      2005
      247
      39
      158.2
    
    
      2006
      120
      31
      156.6
    
    
      2007
      176
      26
      174.2
    
    
      2008
      112
      28
      172.6
    
    
      2009
      200
      28
      171.0
    
    
      2010
      80
      24
      137.6
    
    
      2011
      192
      27
      152.0
    
    
      2012
      64
      16
      129.6
    
    
      2013
      160
      25
      139.2
    
    
      2014
      168
      30
      132.8
    
    
      2015
      184
      31
      153.6
    
    
      2016
      160
      32
      147.2
    
    
      2017
      232
      29
      180.8



In [5]:

    
df.plot(figsize=(15,10), grid=True, lw=5, title='Days of Landsat NDVI above '+str(threshold)+' at '+location)









    Out[5]:





<matplotlib.axes._subplots.AxesSubplot at 0x14bcabf49e8>

Pasture plot



In [6]:

    
df = pd.read_csv('/Users/Peter/Downloads/gbro1.csv')
threshold = .25
location= 'Brown Ranch pasture, GBRO1'



In [7]:

    
df['date'] = pd.to_datetime(df['system:time_start'])
df.index = df['date']
del df['system:time_start']
df = df.dropna() # drop the rows without observations (masked)
df = df[df['date'].dt.year<2018] #chop off 2018 which EE didn't do
logs = df.groupby(df['date'].dt.year).apply(getLogs,th=threshold)
count = df.groupby(df['date'].dt.year).agg({'nd':'count'})
df = logs.to_frame().join(count)
df.columns = ['days of green','number of observations']
df['rolling mean']=df['days of green'].rolling(5).mean()
df









    Out[7]:






  
    
      
      days of green
      number of observations
      rolling mean
    
    
      date
      
      
      
    
  
  
    
      1985
      144
      19
      NaN
    
    
      1986
      80
      16
      NaN
    
    
      1987
      128
      14
      NaN
    
    
      1988
      80
      15
      NaN
    
    
      1989
      64
      13
      99.2
    
    
      1990
      0
      4
      70.4
    
    
      1991
      0
      13
      54.4
    
    
      1992
      48
      12
      38.4
    
    
      1993
      32
      9
      28.8
    
    
      1994
      80
      8
      32.0
    
    
      1995
      80
      10
      48.0
    
    
      1996
      167
      17
      81.4
    
    
      1997
      160
      17
      103.8
    
    
      1998
      176
      15
      132.6
    
    
      1999
      200
      29
      156.6
    
    
      2000
      160
      31
      172.6
    
    
      2001
      200
      32
      179.2
    
    
      2002
      136
      27
      174.4
    
    
      2003
      120
      25
      163.2
    
    
      2004
      176
      38
      158.4
    
    
      2005
      256
      36
      177.6
    
    
      2006
      136
      32
      164.8
    
    
      2007
      192
      28
      176.0
    
    
      2008
      176
      26
      187.2
    
    
      2009
      184
      27
      188.8
    
    
      2010
      200
      29
      177.6
    
    
      2011
      192
      35
      188.8
    
    
      2012
      192
      19
      188.8
    
    
      2013
      192
      26
      192.0
    
    
      2014
      248
      34
      204.8
    
    
      2015
      224
      36
      209.6
    
    
      2016
      232
      34
      217.6
    
    
      2017
      232
      28
      225.6



In [8]:

    
df.plot(figsize=(15,10), grid=True, lw=5, title='Days of Landsat NDVI above '+str(threshold)+' at '+location)









    Out[8]:





<matplotlib.axes._subplots.AxesSubplot at 0x14bcab486d8>

	days of green	number of observations	rolling mean
date
1985	80	19	NaN
1986	64	16	NaN
1987	160	15	NaN
1988	32	15	NaN
1989	112	11	89.6
1990	16	3	76.8
1991	48	13	73.6
1992	112	12	64.0
1993	32	8	64.0
1994	80	8	57.6
1995	80	10	70.4
1996	151	16	91.0
1997	121	16	92.8
1998	80	17	102.4
1999	104	30	107.2
2000	120	30	115.2
2001	128	30	110.6
2002	88	27	104.0
2003	120	22	112.0
2004	208	40	132.8
2005	247	39	158.2
2006	120	31	156.6
2007	176	26	174.2
2008	112	28	172.6
2009	200	28	171.0
2010	80	24	137.6
2011	192	27	152.0
2012	64	16	129.6
2013	160	25	139.2
2014	168	30	132.8
2015	184	31	153.6
2016	160	32	147.2
2017	232	29	180.8